NetNews Offline 2

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Offline 2 / NetNews Offline Volume 2.iso / news / comp / lang / c-part2 / 13308 < prev next >

Wrap

Text File | 1996-08-05 | 2.6 KB | 52 lines

Newsgroups: comp.lang.c,comp.lang.perl Path: netcom.com!crewman From: crewman@netcom.com (Kevin L. Weller) Subject: Re: Random File Access - I don't get it Message-ID: <crewmanDpEEAy.LI6@netcom.com> Sender: crewman@netcom2.netcom.com Organization: Me, Myself, and I X-Newsreader: Forte Free Agent 1.0.82 References: <3160DE1E.495C@teleport.com> Date: Fri, 5 Apr 1996 16:41:46 GMT Scott Kinard <kinards@teleport.com> wrote: > Suppose I have two files, one contains text (1 line of random length text per >'record') and another which is my index into the text file, which contains the starting >and ending byte positions in the text file for each 'record'. Now the question.. >Suppose the text file has 2,000,000 entries. Now, what's the difference in reading >1,000,000 lines from the index file to find the starting and ending byte positions in >the text file for record 1,000,000, and then seeking these in the text file, rather than >just reading 1,000,000 times from the text file to get the same 'record' in the first >place? There may not be much advantage if the index is not sorted, unless the index records are very small. Chances are, though, if there is an index, it IS sorted. Depending on HOW it's sorted, you can either: 1) Use a divide-and-conquer approach (a binary search) on the index to find the index record (average time complexity base 2 log n), then perform one read from the text file (so on average, your 2,000,000-record example would take about base 2 log 2,000,000 + 1, or approximately 22 reads, a h*ll of a lot better than 1,000,000!). Somebody please correct my math if it's off.... :-) 2) If the index is organized by a hashing function, the whole process should take only two reads (assuming no key collisions): one to hash into the index, and the other to read the text file. Either case is tremendously better than the average n/2 reads (2,000,000 / 2 = 1,000,000 in this case) a sequential search of the text file would require. Be cool! ------------------------------------------------------------------------------ Kevin L. Weller, a.k.a. Crewman!!! /-------+--------------------\ crewman@netcom.com | aTm | GIG 'EM, AGGIES! | This is my personal (non-business) account \-------+--------------------/ ------------------------------------------------------------------------------ %SYS-E-BADOPSYS, Fatal system error, DEC VMS halting \ All wiyht. Rho sritched -SYS-I-GETUNIX, Replace with UNIX immediately! \ mg kegtops awound? ------------------------------------------------------------------------------